6,638 research outputs found

    On the Computational Benefit of Multimodal Learning

    Human perception inherently operates in a multimodal manner. Similarly, as machines interpret the empirical world, their learning processes ought to be multimodal. The recent, remarkable successes in empirical multimodal learning underscore the significance of understanding this paradigm. Yet, a solid theoretical foundation for multimodal learning has eluded the field for some time. While a recent study by Lu (2023) has shown the superior sample complexity of multimodal learning compared to its unimodal counterpart, another basic question remains: does multimodal learning also offer computational advantages over unimodal learning? This work initiates a study on the computational benefit of multimodal learning. We demonstrate that, under certain conditions, multimodal learning can outpace unimodal learning exponentially in terms of computation. Specifically, we present a learning task that is NP-hard for unimodal learning but is solvable in polynomial time by a multimodal algorithm. Our construction is based on a novel modification to the intersection of two half-spaces problem.
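    For reference, in the standard (unmodified) intersection-of-two-half-spaces problem, a hypothesis labels a point positive exactly when it satisfies both half-space constraints; the display below states that hypothesis class, with notation chosen here rather than taken from the paper:

\[
h_{w_1, b_1, w_2, b_2}(x) \;=\; \mathbb{1}\!\left[\, \langle w_1, x \rangle \ge b_1 \;\wedge\; \langle w_2, x \rangle \ge b_2 \,\right]
\]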

    Piloting Multimodal Learning Analytics using Mobile Mixed Reality in Health Education

    © 2019 IEEE. Mobile mixed reality has been shown to lead to higher achievement and lower cognitive load within spatial disciplines. However, traditional methods of assessment restrict examiners' ability to holistically assess spatial understanding. Multimodal learning analytics investigates how data types such as spatial data and traditional assessment can be combined to better understand both the learner and the learning environment. This paper explores the pedagogical possibilities of a smartphone-enabled mixed reality multimodal learning analytics case study for health education, focused on learning the anatomy of the heart. The context for this study is the first loop of a design-based research study exploring the acquisition and retention of knowledge by piloting the proposed system with practicing health experts. Outcomes from the pilot study showed engagement with and enthusiasm for the method among the experts, but also revealed problems in the pedagogical method to overcome before deployment with learners.

    Deep Multimodal Learning for Audio-Visual Speech Recognition

    In this paper, we present methods in deep multimodal learning for fusing speech and visual modalities for Audio-Visual Automatic Speech Recognition (AV-ASR). First, we study an approach where unimodal deep networks are trained separately and their final hidden layers are fused to obtain a joint feature space on which another deep network is built. While the audio network alone achieves a phone error rate (PER) of 41% in clean conditions on the IBM large-vocabulary audio-visual studio dataset, this fusion model achieves a PER of 35.83%, demonstrating the tremendous value of the visual channel in phone classification even for audio with a high signal-to-noise ratio. Second, we present a new deep network architecture that uses a bilinear softmax layer to account for class-specific correlations between modalities. We show that combining the posteriors from the bilinear networks with those from the fused model mentioned above yields a further significant phone error rate reduction, with a final PER of 34.03%. Comment: ICASSP 201
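    A minimal PyTorch-style sketch of the feature-fusion approach described above: two unimodal towers stand in for the separately trained audio and visual networks, their final hidden layers are concatenated, and a further network is built on the joint feature space. The layer widths, phone inventory size, and class name are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class LateFusionAVASR(nn.Module):
    """Illustrative feature-fusion model: concatenate the final hidden
    layers of an audio tower and a visual tower, then classify phones
    from the joint representation. Dimensions are placeholders."""

    def __init__(self, audio_dim=40, visual_dim=30, hidden_dim=256, num_phones=42):
        super().__init__()
        # Stand-ins for the separately trained unimodal deep networks.
        self.audio_net = nn.Sequential(nn.Linear(audio_dim, hidden_dim), nn.ReLU())
        self.visual_net = nn.Sequential(nn.Linear(visual_dim, hidden_dim), nn.ReLU())
        # Network built on the concatenated (joint) feature space.
        self.fusion = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_phones),
        )

    def forward(self, audio, visual):
        h_a = self.audio_net(audio)            # final hidden layer, audio
        h_v = self.visual_net(visual)          # final hidden layer, visual
        joint = torch.cat([h_a, h_v], dim=-1)  # joint feature space
        return self.fusion(joint)              # phone logits
```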

    Bridging the Gap Between Informal Learning Pedagogy and Multimodal Learning Analytics

    Multimodal learning happens in different contexts, where different technologies are utilized. In such contexts, learners use different modalities in very unstructured ways, including video, audio, motion, and eye tracking, to mention but a few. However, effective application of Multimodal Learning Analytics remains challenging. Enabling educational technologies are underpinned by various pedagogical models designed to realize the educational value of technology. Nevertheless, the link between Multimodal Learning Analytics and informal learning pedagogy is not well established in the literature. Hence, this chapter aims at bridging the gap between multimodal learning analytics research concepts and approaches on one side and key informal learning pedagogical approaches such as self-directed learning, together with learning theories such as behaviorism and cognitivism, on the other. Establishing this link is expected to pave the way for insightful and pedagogically informed learning analytics across multiple contexts. In addition, Multimodal Learning Analytics techniques and challenges are discussed to highlight the key concerns of MMLA applications.

    Calibrating Multimodal Learning

    Multimodal machine learning has achieved remarkable progress in a wide range of scenarios. However, the reliability of multimodal learning remains largely unexplored. In this paper, through extensive empirical studies, we identify that current multimodal classification methods suffer from unreliable predictive confidence and tend to rely on partial modalities when estimating confidence. Specifically, we find that the confidence estimated by current models can even increase when some modalities are corrupted. To address this issue, we introduce an intuitive principle for multimodal learning: the confidence should not increase when one modality is removed. Accordingly, we propose a novel regularization technique, Calibrating Multimodal Learning (CML) regularization, to calibrate the predictive confidence of previous methods. This technique can be flexibly added to existing models and improves performance in terms of confidence calibration, classification accuracy, and model robustness.
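    The stated principle lends itself to a simple penalty: compare the model's confidence with all modalities present against its confidence after one modality is removed, and penalize any increase. The sketch below illustrates that idea; it is an interpretation of the principle, not the paper's exact CML loss, and the function names are ours.

```python
import torch
import torch.nn.functional as F

def confidence(logits):
    """Predictive confidence: the maximum softmax probability."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

def calibration_penalty(logits_full, logits_one_removed):
    """Penalize violations of the principle that confidence should not
    increase when a modality is removed (illustrative, not the exact
    CML regularizer). Both arguments are per-sample class logits."""
    conf_full = confidence(logits_full)
    conf_partial = confidence(logits_one_removed)
    # Positive only where removing a modality *raised* the confidence.
    return torch.clamp(conf_partial - conf_full, min=0).mean()
```

    In training, such a penalty would typically be added to the usual classification loss with a weighting coefficient.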

    THE USE OF WEB-BASED ASSISTANCE IN MULTIMODAL CHEMISTRY LEARNING AT SENIOR HIGH SCHOOL TO IMPROVE STUDENTS’ MOTIVATION

    The development of information and communication technology (ICT) significantly affects the education sector. Students today are very familiar with ICT, such as computers and the Internet, which can be used as media in chemistry learning. A student-friendly, web-based learning management system (LMS) supplemented with chemistry multimedia can benefit chemistry learning, because multimedia on the web can be accessed anywhere at any time as assistance for students. Learning was managed through various strategies combining web-based assistance with face-to-face learning in the cooperative student team achievement division (STAD) model, a combination referred to as multimodal learning. This research investigated the effect of multimodal chemistry learning on students’ motivation. It was an experimental study measuring the improvement in students’ motivation due to multimodal learning. The sample consisted of two groups of grade X students from SMA N 7 Purworejo: an experimental group of 30 students and a control group of 31 students. The experimental group learned chemistry with multimodal learning and the control group without it. Students’ motivation was measured using a questionnaire and observation and analyzed statistically: motivation was compared between the two groups using an independent-samples t-test, and the improvement in motivation within a group was analyzed using a paired-samples t-test. The results showed that the motivation of students with multimodal learning was significantly higher than that of students without multimodal learning.
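    The between-group and within-group comparisons mentioned above map directly onto standard t-tests; the sketch below shows how such an analysis could be run with SciPy on hypothetical motivation scores (the study's actual data are not reproduced here).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical motivation scores; group sizes follow the study (30 vs. 31).
experimental = rng.normal(loc=80, scale=6, size=30)  # with multimodal learning
control = rng.normal(loc=74, scale=6, size=31)       # without multimodal learning

# Between-group comparison: independent-samples t-test.
t_between, p_between = stats.ttest_ind(experimental, control)

# Within-group improvement: paired-samples t-test on hypothetical pre/post scores.
pre = rng.normal(loc=70, scale=6, size=30)
post = pre + rng.normal(loc=8, scale=3, size=30)
t_within, p_within = stats.ttest_rel(post, pre)

print(f"between groups: t = {t_between:.2f}, p = {p_between:.4f}")
print(f"within group:   t = {t_within:.2f}, p = {p_within:.4f}")
```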

    Challenges in Representation Learning: A report on three machine learning contests

    The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for these challenges and summarize the results of the competitions. We provide suggestions for organizers of future challenges and some comments on what kind of knowledge can be gained from machine learning competitions. Comment: 8 pages, 2 figures